Virtual columns

ABSTRACT

Techniques are described herein for performing column functions on virtual columns in database tables. A virtual column is defined by the database to contain results of a defining expression. Statistics are collected and maintained for virtual columns. Indexing is performed on virtual columns. Referential integrity is maintained between two tables using virtual columns as keys. Join predicate push-down operations are also performed using virtual columns.

RELATED APPLICATIONS

The present application is a divisional of U.S. patent application Ser. No. 11/951,890, entitled Virtual Columns, filed on Dec. 6, 2007 by Subhransu Basu, et al., which is incorporated herein by reference. The present application is related to U.S. patent application Ser. No. 11/951,918, entitled Expression Replacement in Virtual Columns, filed by Subhransu Basu and Harmeek Singh Bedi on Dec. 6, 2007, which is incorporated herein by reference, and is related to U.S. patent application Ser. No. 11/951,933, entitled Partitioning on Virtual Columns, filed by Subhransu Basu, Harmeek Singh Bedi, and Ananth Raghavan on Dec. 6, 2007, which is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to database systems, and in particular, to techniques for representing, manipulating, and using columns and expressions in database systems.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 illustrates an example of a table containing two columns and a virtual column.

FIG. 2 is a diagram of a computer system that may be used in an implementation of an embodiment of the present invention.

DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Introduction

In a database management system (DBMS), data is stored in one or more data containers, each container contains records, and the data within each record is organized into one or more fields. In relational DBMSs, the data containers are referred to as tables, the records are referred to as rows, and the fields are referred to as columns.

In a relational DBMS, many operations can be performed on columns in tables. For example, statistics can be gathered on columns. Columns can also be indexed. Furthermore, constraints can be defined on columns in different tables for the purpose of maintaining referential integrity between the tables. Columns are also utilized in performing join predicate push-down operations. Finally, tables can be partitioned on columns, enabling optimizations such as partition-pruning and partition-wise joins.

Data in columns is stored as part of a table in databases, consuming permanent storage resources. To circumvent the need to store some data, a user may choose to compute the data only when it is needed for an operation and discard the data once the operation has completed. This computation of data may be specified within an expression in a query language like SQL. Expressions provide a way to compute data on demand without the need for permanent storage. According to an embodiment, expressions are also processed and optimized in the same manner as columns.

Virtual Columns: An Example

According to one technique, virtual columns may be defined within a DBMS to facilitate the processing and manipulation of computed data. A virtual column is a derived, or computed, column, which may or may not be materialized on a disk. In other words, unlike a regular column, which is stored on a disk as part of a table, a virtual column may be stored for only a short period of time in temporary storage such as volatile memory, and may be discarded at the end of an operation. The values contained in a virtual column are derived by computation of an expression or function and are computed on demand. Virtual columns can be used in queries just like any other regular table columns, providing a simple, elegant, and consistent mechanism for accessing expressions in a SQL statement.

A virtual column is illustrated with the following DDL statement.

create table t1(c1 number, c2 number, c3 as (c1+c2))

When a DBMS receives and processes this statement, it defines and creates a table t1, where columns c1 and c2 contain values of the number data type. Table t1 also includes a virtual column c3. When a DBMS creates a virtual column, such as c3, in response to receiving a DDL statement such as the one above, the DBMS generates metadata defining the virtual column as containing the results of an expression. A defined virtual column need not contain any physical data. Rather, a virtual column logically contains data values which are based on the results of an expression.

Virtual column c3 is derived from the expression “c1+c2”, where c1 and c2 are regular columns in table t1. The values in a virtual column conform to the data types in the underlying expression that describes the virtual column. In this case, because c1 and c2 contain values of the number data type, c3 also necessarily contains values of the number data type. Table 100 in FIG. 1 illustrates a table t1 with example values in regular columns c1 and c2 and corresponding computed values in virtual column c3. Table 100 contains five rows. In the first row, for example, column c1 contains the value 1 and column c2 contains the value 10. Therefore, virtual column c3, which contains values that are sums of values in columns c1 and c2, contains the value 11 in the first row. Although virtual column c3 is shown as being in table t1, this is only for the purpose of illustration. Virtual column c3 need not be stored on disk as part of table t1 like regular columns c1 and c2, and often will not be in order to conserve storage resources. When a query statement such as “select c3 from t1” is processed by a DBMS, the values of c3 may be computed dynamically based on values stored in regular columns c1 and c2 at the time of the computation.

Once a virtual column is defined, it may be referenced in SQL queries like a regular column. For example, the following SQL statement uses the virtual column c3 in a select statement.

select * from t1 where c3 > 30

The above statement selects all rows from the table t1 which contain a value greater than 30 in the c3 column. Again, the values in c3 are not stored on disk, but are computed on demand by a DBMS. Therefore, when a DBMS executes the above statement, the sums of the values in regular columns c1 and c2 are calculated and compared to the value 30 in determining which rows in t1 fulfill the query request.
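The on-demand evaluation described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how any particular DBMS implements virtual columns: the `Table` class, its methods, and all sample rows other than the first (1, 10) are assumptions introduced for the sketch.

```python
class Table:
    """Hypothetical in-memory table that stores regular columns only."""

    def __init__(self, stored_columns, virtual_columns):
        self.stored_columns = stored_columns      # e.g. ["c1", "c2"]
        self.virtual_columns = virtual_columns    # name -> function of a stored row
        self.rows = []                            # virtual values are never kept here

    def insert(self, *values):
        self.rows.append(dict(zip(self.stored_columns, values)))

    def select(self, predicate=lambda row: True):
        """Yield rows that satisfy the predicate, computing virtual columns on demand."""
        for stored in self.rows:
            row = dict(stored)
            for name, expr in self.virtual_columns.items():
                row[name] = expr(stored)          # computed at query time
            if predicate(row):
                yield row


# Mirrors: create table t1(c1 number, c2 number, c3 as (c1+c2))
t1 = Table(["c1", "c2"], {"c3": lambda r: r["c1"] + r["c2"]})
for c1, c2 in [(1, 10), (30, 3), (20, 25), (5, 5), (40, 2)]:   # sample data (assumed)
    t1.insert(c1, c2)

# Mirrors: select * from t1 where c3 > 30
result = list(t1.select(lambda row: row["c3"] > 30))
```

In this sketch `t1.rows` holds only c1 and c2, while each row in `result` also carries a c3 value that existed only for the duration of the query.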

For simplicity, in the examples in this disclosure, virtual column c3 is described by the expression c1+c2. Significantly, a virtual column may also be derived from SQL functions and user-defined functions in addition to expressions.

Collecting Statistics on Virtual Columns

As discussed above, relational DBMSs store data in tables. To retrieve this data, queries are submitted to a database server, which processes the queries and returns the data requested. Users may use a database query language, such as SQL, to specify queries in a variety of ways.

Queries submitted to a database server are evaluated by a query optimizer. When a query optimizer evaluates a query, it generates various “candidate execution plans” and estimates a query execution cost for each execution plan. The candidate execution plan with the lowest estimated query cost is assumed to be the most efficient and is then selected by the query optimizer as the execution plan to be carried out.

Estimating a query cost can be very complex, and a query optimizer may estimate cardinality (the number of rows to scan and process), selectivity (the fraction of rows from a row set filtered by a predicate), and cost in terms of resources such as disk input and output, CPU usage, and memory usage of the various candidate execution plans in the process of determining the most efficient execution plan from several candidate execution plans.

To estimate selectivity, or how many rows from a table will satisfy a predicate, query optimizers utilize statistical data gathered on columns in tables. Predicates in a database query language specify criteria for queries. For example, a query in a candidate execution plan may request all rows from a particular table which satisfy the predicate that the value in a row for a particular column is less than 4.

Q1 = SELECT *
     FROM t1
     WHERE c1 < 4

Thus, the query Q1 requests all rows from table t1 which contain values less than 4 in column c1. When a query optimizer evaluates a query statement like Q1, it utilizes column statistics to predict the number of rows that will satisfy the predicate in Q1 (“WHERE c1<4”) without performing the query itself. For example, column statistics for c1 may be kept in the form of histograms that indicate a distribution of values in c1. Based on this statistical distribution, a query optimizer can quickly estimate how many rows satisfy the predicate “WHERE c1<4” without fetching all the rows in table t1 and examining c1 for every row fetched (a process also known as a “full table scan”). Although table t1 as illustrated in FIG. 1 contains only five rows, tables in typical relational databases may contain a much higher number of rows. Therefore, column statistics are desirable because a query optimizer can utilize these statistics to estimate the selectivity of predicates in different candidate execution plans without scanning a large number of rows.
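As a rough illustration of how a histogram supports such an estimate, the following Python sketch builds an equi-width histogram and estimates the selectivity of a "less than" predicate from the bucket counts alone. The function names, the equi-width layout, and the assumption that values are spread uniformly inside each bucket are choices made for this sketch; the disclosure does not prescribe a particular histogram type.

```python
def build_histogram(values, num_buckets):
    """Return (low, width, counts) for an equi-width histogram over the values."""
    low, high = min(values), max(values)
    width = (high - low) / num_buckets
    if width == 0:                      # all values identical
        width = 1.0
    counts = [0] * num_buckets
    for v in values:
        idx = min(int((v - low) / width), num_buckets - 1)
        counts[idx] += 1
    return low, width, counts


def estimate_selectivity_less_than(hist, bound):
    """Estimate the fraction of rows with value < bound, using only the histogram."""
    low, width, counts = hist
    matched = 0.0
    for i, count in enumerate(counts):
        bucket_low = low + i * width
        bucket_high = bucket_low + width
        if bound >= bucket_high:
            matched += count                                  # whole bucket qualifies
        elif bound > bucket_low:
            matched += count * (bound - bucket_low) / width   # partial bucket
    return matched / sum(counts)


# Usage for a predicate like "WHERE c1 < 4"; the c1 values are assumed sample data.
c1_values = [1, 30, 20, 5, 40]
hist = build_histogram(c1_values, num_buckets=4)
estimated_rows = estimate_selectivity_less_than(hist, 4) * len(c1_values)
```

The estimate is computed from a handful of bucket counts, so its cost does not grow with the number of rows in the table, which is the property that makes statistics attractive to an optimizer.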

According to one technique, column statistics are collected and maintained for virtual columns in the same manner that column statistics are currently collected and maintained for regular columns. Collecting and maintaining column statistics for virtual columns allows a query optimizer to efficiently estimate query costs for queries which contain virtual columns. In the following example, virtual column c3, which has been previously defined as “c1+c2”, is part of a predicate in query Q2. A query optimizer may access column statistics for c3 to evaluate the cost of query Q2.

Q2 = SELECT *
     FROM t1
     WHERE c3 < 40

In contrast, in the following example, the expression “c1+c2” is part of a predicate in query Q3. However, statistics cannot be collected and maintained for the expression, and a query optimizer will not be able to quickly estimate the query cost of query Q3.

Q3 = SELECT *
     FROM t1
     WHERE (c1+c2) < 40

Thus, although storage resources are conserved in both query Q2 and query Q3 because no column containing the sum of the values in columns c1 and c2 has been saved to disk, the technique of collecting and maintaining statistics for virtual columns allows query Q2 to be efficiently evaluated by a query optimizer.

Therefore, providing statistics support for virtual columns enables query optimizers to estimate the cost of queries containing virtual columns just as efficiently as queries containing regular table columns.
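The contrast between Q2 and Q3 can be sketched in Python as follows. The catalog layout, the min/max-based uniform estimate, and the `DEFAULT_SELECTIVITY` fallback used when no statistics exist are all assumptions made for illustration; the disclosure says only that the optimizer cannot quickly estimate the cost of Q3.

```python
DEFAULT_SELECTIVITY = 0.05   # assumed fallback guess when no statistics are available


def gather_stats(rows, expr):
    """Compute summary statistics once; the per-row values are then discarded."""
    values = [expr(row) for row in rows]
    return {"rows": len(values), "min": min(values), "max": max(values)}


rows = [(1, 10), (30, 3), (20, 25), (5, 5), (40, 2)]   # (c1, c2) sample data, assumed

# Statistics are keyed by column name, so the virtual column c3 gets an entry
# exactly like the regular columns c1 and c2.
catalog = {
    "c1": gather_stats(rows, lambda r: r[0]),
    "c2": gather_stats(rows, lambda r: r[1]),
    "c3": gather_stats(rows, lambda r: r[0] + r[1]),   # c3 as (c1+c2)
}


def estimate_selectivity(operand, bound):
    """Estimate the selectivity of 'operand < bound' assuming a uniform distribution."""
    stats = catalog.get(operand)
    if stats is None:                    # e.g. the bare expression "(c1+c2)" in Q3
        return DEFAULT_SELECTIVITY
    span = stats["max"] - stats["min"]
    if span == 0:
        return 1.0 if bound > stats["min"] else 0.0
    return max(0.0, min(1.0, (bound - stats["min"]) / span))


q2_selectivity = estimate_selectivity("c3", 40)        # uses statistics for c3
q3_selectivity = estimate_selectivity("(c1+c2)", 40)   # no statistics, falls back
```

Only the three summary numbers per column are retained in `catalog`, so supporting the estimate for c3 does not require storing the c3 values themselves.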

Indexing Virtual Columns

Indexes facilitate faster retrieval of data contained in databases. A database index is conceptually similar to a normal index found at the end of a book, in that both kinds of indexes comprise an ordered list of information accompanied with the location of the information. Values in one or more columns of a table are stored in indexes, which are stored and maintained separately from the table itself. The ordered list of information in an index allows for quick scanning to find a target value or range of values.

According to one technique, indexes are created for virtual columns in the same manner as they are for regular columns. Users may also specifically request that an index be created for particular virtual column(s). Once an index for a virtual column is created, the index is maintained and accessed just like an index for a regular column. The various techniques and schemes currently available for structuring and ordering database indexes based on regular columns, such as bitmaps and filtered indexes, are equally applicable to indexes for virtual columns.

In the example below, an index on virtual column c3 may be consulted to quickly retrieve the results for query Q4.

Q4 = SELECT *
     FROM t1
     WHERE c3 = 33

Although table t1 as illustrated in FIG. 1 contains only five rows and a full-table scan can be quickly performed to retrieve the row containing “30” for c1 and “3” for c2, tables in typical relational databases contain a much higher number of rows. Utilizing indexes is especially desirable in cases where computations of expressions are very costly. The following example illustrates a case where indexing a virtual column can significantly increase the computational efficiency of a query.

create table t2(c4 number, c5 number, c6 as (c4*c5))

Q5 = SELECT *
     FROM t2
     WHERE c6 > 20

Table t2 contains a virtual column c6 which contains values that are the product of values in regular columns c4 and c5. Multiplication is a costly computation. When query Q5 is executed, the values in virtual column c6 are computed and compared to the query predicate “WHERE c6>20”. If queries such as Q5, which require access to values in virtual column c6, are common, then the frequent computation of “c4*c5” that is needed to generate and store values for c6 will incur great computational cost. In this case, it is desirable to index the values in virtual column c6 so that queries which require access to values in c6 can directly access the index without the need to recompute, thereby saving a large amount of computational resources. Significantly, an index may be created on virtual column c6 without the need to materialize c6 (i.e., store column c6 on disk as part of table t2).
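A minimal Python sketch of this idea follows: the index is an ordered list of (computed value, row position) pairs kept separately from the table, and a range predicate such as “c6>20” is answered by a binary search over that list. The `VirtualColumnIndex` class, its method name, and the sample rows are assumptions for illustration, and index maintenance on inserts or updates is omitted.

```python
from bisect import bisect_right


class VirtualColumnIndex:
    """Hypothetical ordered index over the values of a virtual column."""

    def __init__(self, rows, expr):
        # The expression is evaluated once per row, when the index is built.
        self.entries = sorted((expr(row), row_id) for row_id, row in enumerate(rows))
        self.keys = [value for value, _ in self.entries]

    def row_ids_greater_than(self, bound):
        """Return positions of rows whose indexed value exceeds bound."""
        start = bisect_right(self.keys, bound)     # binary search, no recomputation
        return [row_id for _, row_id in self.entries[start:]]


# Table t2 stores only (c4, c5); the rows are assumed sample data.
t2 = [(2, 3), (4, 6), (5, 4), (7, 3), (1, 9)]

# Mirrors an index on c6, where c6 is defined as (c4*c5).
c6_index = VirtualColumnIndex(t2, lambda row: row[0] * row[1])

# Mirrors Q5: SELECT * FROM t2 WHERE c6 > 20
matching = [t2[row_id] for row_id in c6_index.row_ids_greater_than(20)]
```

The table `t2` itself still holds only c4 and c5; the products live solely in the index entries, which reflects the point that c6 can be indexed without being materialized as part of the table.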

Referential Integrity for Virtual Columns

Current database systems provide tools for maintaining referential integrity between two or more tables which are logically related. The logical relationships between two tables can be defined by users. For example, Table A contains information about employees and includes a column named “DepartmentNum”. Table B contains information about departments and also includes a column named “DepartmentNum”. A user may define that the two “DepartmentNum” columns are logically linked so that rows in Table B may only contain values in the “DepartmentNum” column which also exist in the “DepartmentNum” column in Table A. In such a case, the “DepartmentNum” column in Table A is referred to as the “primary key”, and the “DepartmentNum” column in Table B is referred to as the “foreign key”. Referential integrity is maintained when values in a column declared as the foreign key are limited to values in the column declared as the primary key.

A database system may enforce referential integrity in a variety of ways. One way of enforcing referential integrity requires that when a particular value in the primary key column is removed, corresponding values in foreign key columns are also removed. For example, Table A contains only one row where the value in the DepartmentNum column is 100. The database system then receives a request to remove this row. Table B also contains a row which contains the value 100 in the DepartmentNum column. When the row containing 100 in the DepartmentNum column in Table A is removed, the database automatically removes the row containing 100 in the DepartmentNum column in Table B in order to maintain referential integrity. Another way of enforcing referential integrity is to generate an error message when removing a row from a particular table would break the referential integrity between the particular table and another table. For example, a user may be informed by an error message that he is required to remove rows in tables that depend on a particular table before he can remove a row from the particular table.

According to one technique, virtual columns may be used as primary keys for the purpose of maintaining referential integrity. Significantly, even when a virtual column is used as a primary key, values in the virtual column may be computed from the base columns on which the virtual column depends and need not be stored on disk. In addition, a change to a value in a regular column from which a virtual column derives automatically triggers a re-computation of the virtual column to reflect the most current values in the regular column.

By providing referential integrity for data values contained in virtual columns, users may define and use logical relationships between data computed from expressions without having to manually check that the logical relationships between data remain intact.
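The following Python sketch illustrates, under simplifying assumptions, a primary key that is a virtual column together with two of the enforcement behaviors described above: declining an insert whose foreign key has no matching primary key value, and removing dependent rows when a primary key value disappears. The class and function names are hypothetical, uniqueness of the key is not checked, and the key values are recomputed on every check rather than cached or indexed.

```python
class ReferentialIntegrityError(Exception):
    """Raised when a change would leave a foreign key without a matching primary key."""


class ParentTable:
    """Stores base columns only; the primary key is a virtual column."""

    def __init__(self, key_expr):
        self.key_expr = key_expr     # expression that defines the virtual key column
        self.rows = []

    def key_values(self):
        # Recomputed from the base columns on each call, so the key always
        # reflects the current stored values.
        return {self.key_expr(row) for row in self.rows}


class ChildTable:
    def __init__(self, parent):
        self.parent = parent
        self.rows = []               # each row is (foreign_key, payload)

    def insert(self, foreign_key, payload):
        if foreign_key not in self.parent.key_values():
            raise ReferentialIntegrityError(f"no parent row with key {foreign_key}")
        self.rows.append((foreign_key, payload))


def delete_parent_row(parent, child, row):
    """Remove a parent row, then remove child rows left without a matching key."""
    parent.rows.remove(row)
    valid = parent.key_values()
    child.rows = [r for r in child.rows if r[0] in valid]


parent = ParentTable(lambda row: row[0] + row[1])   # virtual key c3 = c1 + c2
parent.rows = [(1, 10), (30, 3)]                    # sample (c1, c2) rows, assumed
child = ChildTable(parent)
child.insert(33, "references the row (30, 3)")
```

Calling `child.insert(99, ...)` in this sketch raises `ReferentialIntegrityError`, and `delete_parent_row(parent, child, (30, 3))` removes the dependent child row as well.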

Join Predicate Push-Down with Virtual Columns

As discussed above, query optimizers may generate several candidate execution plans in order to determine which execution plan is most efficient for a particular query. One method of generating a candidate execution plan from a query is to transform the original query (also known as a “base query”) by rewriting the query into a transformed query which can potentially be executed more efficiently. One way of transforming a query is to use a “join predicate push-down”.

In a join predicate pushdown, a join predicate from an outer query that references a column of a view of an outer query is “pushed down” into a view. Join predicate pushdown is illustrated with the following base query QA.

QA = SELECT T1.C, T2.x
     FROM T1, T2, (SELECT T4.x, T3.y
                   FROM T4, T3
                   WHERE T3.p = T4.q and T4.k > 4) V
     WHERE T1.c = T2.d and T1.x = V.x (+) and
           T2.f = V.y (+);

Query QA includes view V. V is the alias or label for the subquery expression (SELECT T4.x, T3.y FROM T4, T3 WHERE T3.p=T4.q and T4.k>4). The subquery expression is referred to herein as a view because it is a subquery expression among an outer query's FROM list items and can be treated, to a degree, like a view or table. Other tables listed in the FROM list are referred to herein as outer tables with respect to the outer query and/or the view. With respect to the view V, Tables T1 and T2 are outer tables, while tables T3 and T4 are not.

Under join predicate pushdown, query QA is transformed to query QA′ as follows.

QA′ = SELECT T1.C, T2.x
      FROM T1, T2, (SELECT T4.x, T3.y
                    FROM T4, T3
                    WHERE T3.p = T4.q and T4.k > 4 and
                          T1.x = T4.x and T2.f = T3.y) V
      WHERE T1.c = T2.d;

The join predicate T1.x=V.x (+) of the outer query is pushed down into view V by rewriting the view V to include the join predicate T1.x=T4.x. T4.x is the equivalent column of the view V.x. Similarly, the join predicate T2.f=V.y (+) is pushed down into the view. The pushed down join predicates do not specify outer-join notation; the outer-join is internally represented by the table being outer-joined.

A pushed-down predicate opens up new access paths, which are exploited to form candidate execution plans that may more efficiently compute a query. For example, a candidate execution plan may compute the join based on join predicate T2.f=T3.y in QA′ using an index on either T2.f or T3.y in an index nested-loops join, which is not possible without this transformation.

According to one technique, join predicate pushdown may be performed on queries which contain virtual columns in outer queries. Significantly, this enables expressions to be pushed down into a view.
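To make the transformation concrete, the following Python sketch evaluates a simplified query both ways: first by computing the whole view and then joining, and then with the join predicate on a virtual column pushed into the view. The schema (outer table T1 with stored columns a and b and virtual column v defined as a+b, and table T4 with columns x and k), the sample rows, and the use of an inner join instead of the outer join in QA are all assumptions made to keep the sketch short.

```python
# Outer table T1 stores (a, b); the virtual column v is defined as a + b.
T1 = [(1, 2), (10, 5), (4, 4)]
# Table T4 stores (x, k) and underlies the view.
T4 = [(3, 9), (15, 7), (8, 1), (15, 2)]


def v(row):
    """Virtual column v of T1, computed on demand from the stored columns."""
    return row[0] + row[1]


def view():
    """SELECT T4.x FROM T4 WHERE T4.k > 4"""
    return [x for x, k in T4 if k > 4]


def base_query():
    """SELECT T1.a, V.x FROM T1, (view) V WHERE T1.v = V.x"""
    materialized = view()                       # the whole view is computed first
    return [(row[0], x) for row in T1 for x in materialized if v(row) == x]


def view_with_pushed_predicate(outer_value):
    """SELECT T4.x FROM T4 WHERE T4.k > 4 AND T4.x = <value of T1.a + T1.b>"""
    return [x for x, k in T4 if k > 4 and x == outer_value]


def transformed_query():
    """Same result, but the join predicate is evaluated inside the view."""
    return [(row[0], x) for row in T1 for x in view_with_pushed_predicate(v(row))]
```

Both functions return the same rows. In this sketch the pushed-down predicate is still applied by a linear filter; in a DBMS the benefit comes from the predicate on T4.x becoming available to access paths such as an index lookup inside the view.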

Hardware Overview

FIG. 2 is a block diagram that illustrates a computer system 200 upon which an embodiment of the invention may be implemented. Computer system 200 includes a bus 202 or other communication mechanism for communicating information, and a processor 204 coupled with bus 202 for processing information. Computer system 200 also includes a main memory 206, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 202 for storing information and instructions to be executed by processor 204. Main memory 206 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 204. Computer system 200 further includes a read only memory (ROM) 208 or other static storage device coupled to bus 202 for storing static information and instructions for processor 204. A storage device 210, such as a magnetic disk or optical disk, is provided and coupled to bus 202 for storing information and instructions.

Computer system 200 may be coupled via bus 202 to a display 212, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 214, including alphanumeric and other keys, is coupled to bus 202 for communicating information and command selections to processor 204. Another type of user input device is cursor control 216, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 204 and for controlling cursor movement on display 212. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

The invention is related to the use of computer system 200 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 200 in response to processor 204 executing one or more sequences of one or more instructions contained in main memory 206. Such instructions may be read into main memory 206 from another machine-readable medium, such as storage device 210. Execution of the sequences of instructions contained in main memory 206 causes processor 204 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term “machine-readable medium” as used herein refers to any medium that participates in providing data that causes a machine to operate in a specific fashion. In an embodiment implemented using computer system 200, various machine-readable media are involved, for example, in providing instructions to processor 204 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 210. Volatile media includes dynamic memory, such as main memory 206. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 202. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications. All such media must be tangible to enable the instructions carried by the media to be detected by a physical mechanism that reads the instructions into a machine.

Common forms of machine-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of machine-readable media may be involved in carrying one or more sequences of one or more instructions to processor 204 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 200 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 202. Bus 202 carries the data to main memory 206, from which processor 204 retrieves and executes the instructions. The instructions received by main memory 206 may optionally be stored on storage device 210 either before or after execution by processor 204.

Computer system 200 also includes a communication interface 218 coupled to bus 202. Communication interface 218 provides a two-way data communication coupling to a network link 220 that is connected to a local network 222. For example, communication interface 218 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 218 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 218 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 220 typically provides data communication through one or more networks to other data devices. For example, network link 220 may provide a connection through local network 222 to a host computer 224 or to data equipment operated by an Internet Service Provider (ISP) 226. ISP 226 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 228. Local network 222 and Internet 228 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 220 and through communication interface 218, which carry the digital data to and from computer system 200, are exemplary forms of carrier waves transporting the information.

Computer system 200 can send messages and receive data, including program code, through the network(s), network link 220 and communication interface 218. In the Internet example, a server 230 might transmit a requested code for an application program through Internet 228, ISP 226, local network 222 and communication interface 218.

The received code may be executed by processor 204 as it is received, and/or stored in storage device 210, or other non-volatile storage for later execution. In this manner, computer system 200 may obtain application code in the form of a carrier wave.

In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

1. A computer-implemented method for maintaining referential integrity between two tables, comprising: receiving a designation that a first column in a first table is a primary key, wherein the first column is a virtual column, wherein a DBMS defines the virtual column as containing results of an expression, and wherein a virtual column comprises at least one data value computed from an expression or a function; receiving a designation that a second column in a second table is a foreign key; and maintaining referential integrity between data values in the first column and data values in the second column.

2. The computer-implemented method of claim 1, wherein the step of maintaining referential integrity comprises: removing, from the second table, any rows whose value in the second column does not correspond to a value in the first column.

3. The computer-implemented method of claim 1, wherein the step of maintaining referential integrity comprises: receiving a request to add a row to the second table, determining whether adding the row to the second table results in adding a value to the second column where the value does not correspond to a value in the first column, in response to determining that adding the row to the second table results in adding a value to the second column where the value does not correspond to a value in the first column, declining to process the request to add the row to the second table.

4. The computer-implemented method of claim 3, wherein the step of declining to process the request further comprises generating an error message.

5. A computer-implemented method, comprising: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a join predicate that references: a virtual column of an outer table of the outer query, and a column returned by the view, wherein a DBMS defines the virtual column as containing results of an expression, and wherein a virtual column comprises at least one data value computed from an expression or a function; wherein generating the transformed query includes pushing down the join predicate to create a pushed down join predicate that references the column of the outer table and a certain column returned by the view is based.

6. A non-transitory computer-readable storage medium storing instructions for maintaining referential integrity between two tables, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of: receiving a designation that a first column in a first table is a primary key, wherein the first column is a virtual column, wherein a DBMS defines the virtual column as containing results of an expression, and wherein a virtual column comprises at least one data value computed from an expression or a function; receiving a designation that a second column in a second table is a foreign key; and maintaining referential integrity between data values in the first column and data values in the second column.

7. The non-transitory computer-readable storage medium of claim 6, wherein the step of maintaining referential integrity comprises: removing, from the second table, any rows whose value in the second column does not correspond to a value in the first column.

8. The non-transitory computer-readable storage medium of claim 6, wherein the step of maintaining referential integrity comprises: receiving a request to add a row to the second table, determining whether adding the row to the second table results in adding a value to the second column where the value does not correspond to a value in the first column, in response to determining that adding the row to the second table results in adding a value to the second column where the value does not correspond to a value in the first column, declining to process the request to add the row to the second table.

9. The non-transitory computer-readable storage medium of claim 8, wherein the step of declining to process the request further comprises generating an error message.

10. A non-transitory computer-readable storage medium storing instructions, the instructions including instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a join predicate that references: a virtual column of an outer table of the outer query, and a column returned by the view, wherein a DBMS defines the virtual column as containing results of an expression, and wherein a virtual column comprises at least one data value computed from an expression or a function; wherein generating the transformed query includes pushing down the join predicate to create a pushed down join predicate that references the column of the outer table and a certain column returned by the view is based.